A toolkit for cross-validation: The R package cvTools

Author

  • Andreas Alfons

Abstract

The idea of cross-validation is simple and easy to implement: split the data into several blocks, estimate the model with one block left out, and predict the values of the left-out block. Repeating this for each block yields predictions for all observations, which are then used to compute a prediction loss function. Even though the basic procedure is simple, additional programming effort is necessary for more complex procedures such as repeated (double) cross-validation, or for using cross-validation to select the optimal combination of tuning parameters. While many packages for computing regression models already offer functionality for cross-validation, their interfaces differ and the returned objects differ in structure. Furthermore, developers often copy and paste the basic code skeleton when implementing cross-validation for different models, which complicates code maintenance.
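The basic procedure described above can be sketched in a few lines. The following is a minimal illustration in plain Python, not the cvTools interface: the function names (`make_folds`, `fit_line`, `cross_validate`, `repeated_cross_validate`), the simple least-squares line as the model, and the root mean squared prediction error as the loss are all assumptions chosen for the example.

```python
import random
from math import sqrt

def make_folds(n, k, seed=0):
    """Randomly assign the indices 0..n-1 to k blocks of near-equal size."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def cross_validate(xs, ys, k=5, seed=0):
    """K-fold cross-validation estimate of the root mean squared prediction error."""
    squared_errors = []
    for test_idx in make_folds(len(xs), k, seed):
        held_out = set(test_idx)
        # Estimate the model on all blocks except the left-out one.
        train_idx = [i for i in range(len(xs)) if i not in held_out]
        a, b = fit_line([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        # Predict the left-out block and record the squared errors.
        squared_errors.extend((ys[i] - (a + b * xs[i])) ** 2 for i in test_idx)
    return sqrt(sum(squared_errors) / len(squared_errors))

def repeated_cross_validate(xs, ys, k=5, repeats=10):
    """Average the K-fold estimate over several random splits (repeated CV)."""
    estimates = [cross_validate(xs, ys, k, seed=r) for r in range(repeats)]
    return sum(estimates) / len(estimates)

# Toy data: an exactly linear relationship, so the prediction error is near zero.
xs = [float(i) for i in range(20)]
ys = [2.0 + 3.0 * x for x in xs]
rmspe = cross_validate(xs, ys, k=5)
rmspe_repeated = repeated_cross_validate(xs, ys, k=5, repeats=3)
```

Even in this small sketch, the splitting, fitting, and loss computation are tangled together in one loop, which is the kind of skeleton that tends to get copied and pasted from model to model.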

Similar resources

An R package to process LC/MS metabolomic data: MAIT (Metabolite Automatic Identification Toolkit)

Processing metabolomic liquid chromatography and mass spectrometry (LC/MS) data files is time-consuming. Currently available R tools allow for only a limited number of processing steps, and online tools are hard to use in a programmable fashion. This paper introduces the Metabolite Automatic Identification Toolkit (MAIT) package, which allows users to perform end-to-end LC/MS metabolomic data analy...

Translation, Adaptation and Validation of Referral Systems Assessment and Monitoring Toolkit for the Family Physicians Program in Iran

Background and purpose: Studies on the function of the referral system in Iran have not covered all aspects and structures of the system. This could be due to the lack of an appropriate tool for investigating the referral system in Iran. The current study was done to translate and assess the validity of the Referral Systems Assessment and Monitoring (RSAM) Toolkit based on family physician ...

Interpolation over Large Distances Using Spherekit

Spherekit is a spatial interpolation toolkit developed and distributed over the internet by the National Center for Geographic Information and Analysis (NCGIA). A unique feature of the software is its ability to work directly with the spherical geometry of the earth. Thus, distances, areas, and directions are spherically based, and interpolation can be carried out over large distances without d...

CVThresh: R Package for Level-Dependent Cross-Validation Thresholding

The core of the wavelet approach to nonparametric regression is thresholding of wavelet coefficients. This paper reviews a cross-validation method for the selection of the thresholding value in wavelet shrinkage of Oh, Kim, and Lee (2006), and introduces the R package CVThresh implementing details of the calculations for the procedures. This procedure is implemented by coupling a conventional c...

The R Package groc for Generalized Regression on Orthogonal Components

The R package groc for generalized regression on orthogonal components contains functions for the prediction of q responses using a set of p predictors. The primary building block is the grid algorithm used to search for components (projections of the data) which are most dependent on the response. The package offers flexibility in the choice of the dependence measure which can be user-defined....

Publication date: 2012